RePP Africa – a georeferenced and curated database on existing and proposed wind, solar, and hydropower plants

Promoting a transition to low-carbon energy systems to mitigate climate change requires an optimization of renewable energy (RE) planning. However, curated data for the most promising RE technologies, hydro-, wind and solar power, are missing, which limits data-based decision-making support. Here, a spatially explicit database for existing and proposed renewable power plants is provided: The Renewable Power Plant database for Africa (RePP Africa) encompasses 1074 hydro-, 1128 solar, and 276 wind power plant records. For each power plant, geographic coordinates, country, construction status, and capacity (in megawatt) are reported. The number of RePP Africa records exceeds the respective values in other existing open-access databases and matches available cumulative capacity data reported by international energy organizations best with deviations <13% for hydro-, <23% for wind, and <32% for solar power plants. This contemporary database is the most harmonized open-accessible reference source on RE power plants across Africa for stakeholders from science, (non-)governmental organizations, consulting, and industry; providing a fundamental data basis for the development of an integrated sustainable RE mix.

reporting and distinguishing the three technology types are still lacking. As a consequence, the opportunities for and impacts of these technology types on a national or continental level are poorly studied, constraining advanced hydropower energy planning 16 .
During recent years, solar and wind power have exhibited the highest growth rates among Africa's renewable energy (RE) resources, yet they still contribute marginally to Africa's energy resource mix (i.e. solar: 1.2%; wind: 1.5% share of total electricity generation in 2019 17 ). Given the dependence of solar and wind power on meteorological variables, power generation from these RE resources is variable and intermittent, from short (sub-hourly) to long (seasonal and interannual) timescales 18 . In order to address this challenge, increased research attention has been given to approaches that investigate integrated RE storage options, to thereby exploit the complementary spatiotemporal properties of RE resources 19,20 .
Various databases on renewable power plants have been published. On a global scale, the Global Energy Observatory 21 , the Open Infrastructure Map 22 , and the Global Power Plant Database 23 provide georeferenced information on fossil fuel and renewable power plants. Up to now, however, these open-source databases lack information for Africa, in particular in the fast-developing domains of solar and wind power. The Global Reservoir and Dam Database (GRanD) and the Future Hydropower and Reservoir Database (FHReD) are frequently cited as established databases reporting existing and future HPPs 24,25 ; yet only HPPs operating with a dam and a reservoir are included. Published in 2021, the African Hydropower Atlas (AHA) presents a harmonized dataset on existing and planned hydropower plants to facilitate modelling of power systems across Africa 26 . At the same time, its restriction to hydropower plants limits renewable power plant modelling. To implement integrated modelling approaches on the (potential) electricity mix and its implications in Africa, the science community has increasingly used the African Energy Live Data database, provided by the African Energy company, as a reliable source 27,28,29 . However, the London-based consultancy only provides small shares of its data to the public and charges for further downloads for analysis 28 . The Wind Power is one of a few global databases that provide information on existing and proposed wind farms in Africa, but, similar to African Energy Live Data, only parts of it are freely accessible 30 . For solar power plants, the Wiki-Solar database 31 provides a similar service and covers more than 10,000 power plants globally. Again, data usage and replication are restricted, and the data are not available under a Creative Commons license.
The scientific use of renewable energy datasets without a Creative Commons license conflicts with the requirement that research published in scientific journals, including its accessible datasets, be reproducible. This lack of harmonized, open-access, and reliable datasets with georeferenced information on existing and proposed HPPs, solar power plants (SPPs), and wind power plants (WPPs) limits ongoing research efforts on the sustainable development of the energy resource mix and constrains a science-based discussion among stakeholders in the decision process.
In summary, two main approaches are currently used for renewable energy analyses on a continental or global scale: (1) analyses are performed for one RE type, using a corresponding database 32 , or (2) integrated analyses for different RE types are performed with data compiled from various databases 33,34 or from databases behind paywalls 27 . The lack of a comprehensive and up-to-date database covering comparable information on existing and proposed HPPs, SPPs, and WPPs limits integrated renewable energy planning worldwide, in particular for Africa.
Here, a comprehensive, curated, and georeferenced renewable power plant database for Africa (RePP Africa) is presented (last revision: 16.11.2022) 35 . Data records were compiled and processed from various sources for all African countries. The establishment of the database included four steps: compilation, georeferencing, completion, and revision (Fig. 1).
The first openly accessible and harmonized renewable power plant database covering all of Africa includes georeferenced information on a total of 1074 HPPs, 1128 SPPs, and 276 WPPs. Of these, 401 HPPs, 411 SPPs, and 127 WPPs are existing or under construction, with a total capacity of 59.56 gigawatts (GW), 10.56 GW, and 10.53 GW, respectively (Table 1). As of November 2022, 673 HPPs, 717 SPPs, and 149 WPPs are proposed, with a total respective capacity of 130.85 GW, 53.32 GW, and 16.87 GW.
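As a quick consistency check, the counts and capacities quoted above can be summed per technology. The following Python sketch is purely illustrative: the dictionary layout and the names `reported`, `totals`, and `summary` are assumptions made here and are not part of RePP Africa.

```python
# Counts and capacities (GW) exactly as quoted in the text above.
# The dictionary layout and all names are illustrative assumptions.
reported = {
    "HPP": {"existing_or_under_construction": (401, 59.56), "proposed": (673, 130.85)},
    "SPP": {"existing_or_under_construction": (411, 10.56), "proposed": (717, 53.32)},
    "WPP": {"existing_or_under_construction": (127, 10.53), "proposed": (149, 16.87)},
}


def totals(by_status):
    """Return (number of plants, capacity in GW) summed over all status groups."""
    n_plants = sum(count for count, _ in by_status.values())
    capacity_gw = round(sum(cap for _, cap in by_status.values()), 2)
    return n_plants, capacity_gw


summary = {plant_type: totals(by_status) for plant_type, by_status in reported.items()}

for plant_type, (n_plants, capacity_gw) in summary.items():
    print(f"{plant_type}: {n_plants} plants, {capacity_gw:.2f} GW")
```

The summed record counts reproduce the stated totals of 1074 HPPs, 1128 SPPs, and 276 WPPs.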
RePP Africa lists three types of power plant facility status (status_inf): existing (E), under construction (U), and proposed (P). Proposed plants include potential sites for which feasibility studies have been carried out. Once construction has started, the status changes to under construction. Once a plant is officially inaugurated, its status changes to existing. Each data entry is provided with a time stamp indicating when the status was last checked.
Power plant facilities might exist but not generate electricity for uncertain time periods due to destruction or other reasons. However, the searched databases do not distinguish between operating and temporarily non-operating existing plants. To enable RePP Africa users to consider this issue when using RePP Africa data 35 , we include the status of electricity generation (status_ele), which distinguishes between operating (O), under construction (U), proposed (P), and not operating (NO). Because the infrastructure of a non-operating plant still exists, such plants could be rehabilitated and operate again.
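The two status attributes can be read together to identify plants whose infrastructure exists but which do not currently generate electricity. The sketch below is a minimal illustration in Python; the codes (E, U, P, O, NO) are taken from the text, whereas the class names and the helper `is_rehabilitation_candidate` are hypothetical and assume that the codes are stored as plain strings.

```python
from enum import Enum


class InfrastructureStatus(Enum):
    """Codes of the status_inf attribute as described in the text."""
    EXISTING = "E"
    UNDER_CONSTRUCTION = "U"
    PROPOSED = "P"


class GenerationStatus(Enum):
    """Codes of the status_ele attribute as described in the text."""
    OPERATING = "O"
    UNDER_CONSTRUCTION = "U"
    PROPOSED = "P"
    NOT_OPERATING = "NO"


def is_rehabilitation_candidate(status_inf: str, status_ele: str) -> bool:
    """True if the facility exists but is not generating electricity.

    Hypothetical helper; raises ValueError for codes not listed above.
    """
    infrastructure = InfrastructureStatus(status_inf)
    generation = GenerationStatus(status_ele)
    return (infrastructure is InfrastructureStatus.EXISTING
            and generation is GenerationStatus.NOT_OPERATING)


print(is_rehabilitation_candidate("E", "NO"))
print(is_rehabilitation_candidate("E", "O"))
```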
For HPPs and SPPs, different operating systems are distinguished. 446 HPPs operate with reservoir storage, 286 as run-of-river HPPs, and 17 HPPs with a pumped-storage system. No adequate information could be provided for 325 HPPs (30% of total), with 86% of these categorized as proposed (281 HPPs). Most SPPs (1072) are operating or proposed as photovoltaic (PV), 47 as concentrated solar power (CSP), and 9 as concentrator photovoltaics (CPV) type plants.
As of November 2022, all 55 African countries have installed or proposed energy generation capacity from RE resources. Solar power and wind power are playing an increasingly important role in the total RE resource mix, with shifts between installed and proposed total capacity differing among countries (Fig. 2).
The contemporary, curated database on renewable power plants (existing, under construction, proposed) in African countries will enable the research community to address and fill current research gaps and to advance integrated renewable energy modelling. Openly accessible data on renewable energy plants vary in quality among countries. RePP Africa 35 is intended to stimulate integrated research and large-scale assessments at a continental level as well as to foster case studies and research activities in data-poor regions of less-studied African countries.
www.nature.com/scientificdata

Methods
The spatially explicit renewable power plant database for Africa (RePP Africa 35 ) aims to advance existing efforts in the field of openly accessible renewable energy data. By harmonizing, reviewing, and updating information on power plants from established databases 21,23 and adding information from other sources, RePP Africa 35 is a coherent database on the current key technologies hydro-, solar, and wind power for the African continent. RePP Africa was created to respond to an increasing demand for datasets that allow integrated electricity planning and impact assessment modelling on renewable power technologies. RePP Africa 35 is composed of three sub-datasets, containing information for hydropower, solar power, and wind power. This approach enables the user to apply the database either for a specific energy source or across RE sources.
Data collection started in March 2021. The last revision was finished in November 2022. Power plants scheduled to go into operation in 2022 are kept in the categories "under construction" or "proposed" if no source indicating inauguration or construction start is given. However, continuous revisions will be performed in order to update the database and to adjust information if necessary or justified.
Compilation. Data were collected from a wide array of available information sources. First, data records on existing and proposed projects were compiled from available databases with a Creative Commons license (Fig. 1). Databases had to match the following criteria: (1) Coverage of the database is global or continental (Africa). Due to the limited availability of freely accessible databases for wind and solar power plants, we accepted as an exception the ECOWAS Observatory for Renewable Energy and Energy Efficiency (ECOWREX) 36 covering West Africa.
(2) The database includes information on hydropower, solar power, or wind power plants: The database contains information on all RE resources or, in selected cases, only on a specific energy source. (3) The database includes precise geographic information on the power plant location, ideally coordinates (latitude, longitude), or is displayed in the form of a freely accessible map that indicates plant location. In the latter case, coordinates were manually derived using ArcGIS Pro 37 39 provides capacity values in megawatts only for existing plants, which was accepted as an exception because AKP was the database with the highest number of data records for all renewable resources. Data entries were combined with further information found during the following revision steps.
Four databases were used to compile information of power plants for each renewable source: For hydropower plants we used information from the Global Reservoir and Dam Database v1.3 (GRanD) 24 , the Future Hydropower and Reservoir Database (FHReD) 25 , the African Hydropower Atlas v2.0 (AHA) 26 36 , and Open Infrastructure Map 22 . After collecting and harmonizing power plant information for each resource in an Excel sheet, all data entries were revised using the Google search engine (Revision 1). Entries of power plants for which sources prove that planning has been discontinued were deleted. If the database compilation revealed inconsistent information, we consulted further sources to ensure that correct information is provided for each plant. In doing so, we prioritized (1) information from individual, specific project reports and (2) information from up-to-date sources (referring to the current timestamp of the data entry). Inconsistent information was mainly found for proposed power plants, because feasibility studies and project proposals do not commit to a single capacity value to be installed but present different capacity values. For the attribute g_cap_mw (given capacity in megawatts (MW)), we selected the capacity indicated by the majority of sources and saved further information in the field "other capacity" (other_cap_mw).

Table 2. Overview of attributes (metadata) provided in the African renewable power plant database (RePP Africa). Attribute (column name) and a description are given. For each renewable type (hydropower plant database (HPPD), solar power plant database (SPPD), and wind power plant database (WPPD)), the share of records that have a data entry for this attribute is indicated as a percentage [%] of the total number of records (i.e., 100% indicates that the attribute is given for all plants). If attributes are not available for all data records (<100%), the relative proportion of available records is given in percent [%] for all plants and for existing and under construction/proposed plants separately (in brackets). Column headers: Total number of records; Total of available data records in percent [%] (existing (E)/under construction (U)/proposed (P)).

Additional power plants found during revision 1 in other literature sources matching criteria (3), (4), and (5) were added. Only openly accessible information from databases with a Creative Commons license was included. All licences that apply to the searched databases are listed in RePP Africa 35 .

Georeferencing. All plants with given coordinates were imported to ArcGIS Pro 37 and QGIS 38 software. All plants with locations indicated or described by reports or other sources were added manually, and coordinates were checked for plausibility. Existing plant locations were cross-checked using Google Maps and OpenStreetMap and, if necessary, corrected. All HPPs were snapped to river lines of the RiverATLAS (HydroATLAS) 35 21 . The database Power Africa from the African Development was excluded during the last revision in November 2022, because the database is no longer accessible for unknown reasons. Table 2 summarizes all available attributes and the relative proportion [%] of plants for which the information is available. In addition, RePP Africa 35 indicates for each plant if it is listed in one of the consulted databases and specifies further used sources. Such additional data sources can be divided into three categories: (1) summary reports and books that list power plants on a regional/national scale including all three power types or on a continental/global level for a specific power source; (2) information pertaining to individual projects, from technical project fact sheets, environmental impact assessments, peer-reviewed papers, and project summaries or presentations from engineering companies; and (3) online newspaper articles and social media announcements.
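The majority rule for the attributes g_cap_mw and other_cap_mw described under Compilation can be illustrated with a short Python sketch. It is not the procedure used to build the database, which was compiled manually in a spreadsheet; the function name `choose_capacity` and the tie-breaking behaviour (the value reported first wins if several values are equally frequent) are assumptions made here for illustration.

```python
from collections import Counter


def choose_capacity(reported_mw):
    """Split capacity values reported by several sources into
    (g_cap_mw, other_cap_mw) using a simple majority rule.

    Assumption: if several values are equally frequent, the value that
    was reported first is chosen.
    """
    if not reported_mw:
        raise ValueError("at least one reported capacity is required")
    counts = Counter(reported_mw)
    g_cap_mw, _ = counts.most_common(1)[0]
    other_cap_mw = sorted(value for value in counts if value != g_cap_mw)
    return g_cap_mw, other_cap_mw


# Hypothetical example: three sources, two of which agree on 120 MW.
print(choose_capacity([120.0, 150.0, 120.0]))
```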

Final revision. Each plant record was revised by the authors and specifically checked for plant existence or proposal. To increase data quality and exclude outdated and incorrect data records, each entry is confirmed by two or more references. To ensure this, we consulted sources found via the Google search engine and freely accessible data from African Energy Live Data 28 , The Wind Power 30 , and Wiki-Solar 31 . The consultation of these databases was in agreement with their respective terms of use: No information was downloaded or copied, and only freely accessible information was consulted to confirm information collated for RePP Africa 35 . Collected data records were excluded when (1) a second source was not found during final revision or (2) the proposed plant was irreversibly cancelled. The date of the last revision and edit is indicated for each plant and for the complete database. The last revision of the complete current version was in November 2022. Just before submission of the RePP Africa 35 database, 5% of all database entries were randomly selected (49 HPPs, 67 SPPs, and 8 WPPs) and carefully checked again against the selection criteria defined above to provide the most up-to-date database version and to estimate the potential error in the data (e.g. due to typos or invalid links to online references). This final check was conducted between November 16 and 17, 2022, and did not show any errors in the selected entries.
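The two exclusion rules and the 5% random check can be sketched as follows. This is an illustrative Python example only; the function names, the pooled sampling across all three plant types, the rounding up of the sample size, and the fixed random seed are assumptions and do not describe how the authors drew their sample.

```python
import math
import random


def keep_record(n_references, irreversibly_cancelled):
    """A record is kept only if it is confirmed by at least two
    references and the plant has not been irreversibly cancelled."""
    return n_references >= 2 and not irreversibly_cancelled


def sample_for_check(record_ids, fraction=0.05, seed=0):
    """Draw a random subset of records for a final manual check.

    Assumptions: records of all plant types are pooled, the sample size
    is rounded up, and a fixed seed makes the draw repeatable.
    """
    ids = list(record_ids)
    k = math.ceil(fraction * len(ids))
    return random.Random(seed).sample(ids, k)


# Hypothetical record IDs 0..2477 standing in for the 2478 database records.
checked = sample_for_check(range(2478))
print(len(checked))
```

With 2478 records, rounding up 5% gives a sample of 124 records.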

Data Records
RePP Africa 35 is collated in a single spreadsheet-based file and consists of the hydropower plant database (HPPD), solar power plant database (SPPD), and wind power plant database (WPPD). Figure 3 maps all data records according to resource type (symbol colour, symbol shape) and capacity, i.e. size in megawatts (symbol size).
RePP Africa 35 provides data records for hydropower, solar power, and wind power plants in all African countries (Fig. 3). The database is hosted on figshare 35 . The repository includes one Excel file with eleven sheets: one information sheet on the general structure of the file and the sheets included (Info), one overarching sheet with metadata (S1) and, for each of the three RE resources (hydro-, solar, and wind power), three specific sheets that provide (1) the RE-specific metadata (S2, S5, S8), (2) the respective dataset (S3, S6, S9), and (3) the data sources (S4, S7, S10). The overarching table with metadata gives an overview of all attributes, including descriptions of attributes and the number of plants for which an attribute is reported (Table 2). Figure 4 summarizes the RePP Africa 35 data entries, which were further aggregated for each resource type in four capacity size categories to illustrate differences between share of capacity and number of existing and proposed power plants (Fig. 5). We differentiated between small (1-10 MW), medium (>10-100 MW), large (>100-1000 MW), and very large (>1000 MW) power plants.
During the first step of data compilation (Fig. 1), several plants of the searched databases were rejected and not compiled for varying reasons: (1) The capacity was below 1 MW.  Table 3 gives an overview of the number of plants that were disregarded after the first search of the databases.
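The four capacity size categories can be expressed as a small classification function. The Python sketch below uses the class boundaries stated above; the function names, the returned dictionary layout, and the decision to raise an error for capacities below 1 MW (which are not part of the database) are illustrative assumptions.

```python
from collections import defaultdict


def capacity_class(cap_mw):
    """Assign a plant to a size class: small (1-10 MW), medium (>10-100 MW),
    large (>100-1000 MW), or very large (>1000 MW)."""
    if cap_mw < 1:
        raise ValueError("plants below 1 MW are not part of the database")
    if cap_mw <= 10:
        return "small"
    if cap_mw <= 100:
        return "medium"
    if cap_mw <= 1000:
        return "large"
    return "very large"


def aggregate_by_class(capacities_mw):
    """Count plants and sum capacity (MW) per size class."""
    result = defaultdict(lambda: {"n_plants": 0, "capacity_mw": 0.0})
    for cap in capacities_mw:
        size_class = capacity_class(cap)
        result[size_class]["n_plants"] += 1
        result[size_class]["capacity_mw"] += cap
    return dict(result)


# Hypothetical capacities in MW, for illustration only.
print(aggregate_by_class([5.0, 10.0, 50.0, 250.0, 2000.0]))
```

Counting plants and summing capacity per class in this way shows how a few very large plants can dominate the capacity share while small plants dominate the record count.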

Technical Validation
RePP Africa 35 compiles and revises existing data on hydropower, solar power, and wind power for the entire African continent. The presented openly accessible and curated database is an attempt to meet the demand of the renewable energy modelling community for free, accurate, and harmonized scientific data. All data and related information have been checked several times and are confirmed by at least two references each. Limitations in the accuracy of locations and other attributes might occur, in particular for proposed power plants. We performed a technical validation for RePP Africa 35 by randomly selecting 5% of all data entries after the last full revision, which ended on 16 November 2022. The random selection resulted in a dataset of 124 data entries (49 hydro-, 67 solar, and 8 wind power plants). We checked all attributes and sources. All data entries were consistent with all or the majority of the provided sources. We did not find inconsistent or missing data entries or non-functioning links. Each reported status (stat_inf) was in line with the given sources, and all indicated databases were correct. For 6%, minor discrepancies were found (duplicate source (1), one of all sources refers to a different plant (5), majority of sources indicates a slightly different capacity (2)). However, despite these discrepancies, the majority of sources still confirmed the given information, and in case of differing capacity values the difference was acceptable (<5%).

Table fragment (column headers missing from the source text):
Total No. of plants  868  634  200   66  318  120   51  263  138   41   14   35
Algeria                8    2    0    1   15    8    -    3    1    -    -    1
Madagascar            11    0    -    1    1    1    -    2    1    -    -    -
Angola                27    2    1    0    -    -    -    1    -    -    -    -
Malawi                 5    1    0    -    -    -    -    0    -    -    -    -
Benin                  1    2    0    0    -    -    4    0    -    -    -    -
Mali                   3    2    2    -    3    -    1    4    1    -
Outstanding in comparison to other databases is the number of references cited for each data record: 36% of all hydro-, 77% of all solar, and 59% of all wind power plant records are validated by two or three sources, the rest is validated by up to 15 references (Table 4).
In total, 1172 different references are provided for HPPs, 1134 for SPPs, and 432 for WPPs. Of these, 33% were collected from website and newspaper articles, 20% from development or environmental reports, company or government PowerPoint presentations, and fact sheets, 16% from other databases, 14% from company websites, and 10% from encyclopaedia entries. Less frequently, information was obtained from peer-reviewed articles (3%), UN, development bank, and government websites (2%), social media (1%), and books or theses (1%).
Compiled total numbers and capacities of the different renewable power plant types have been cross-checked with available cumulated continent-related data (Table 3). (Table 5). It highlights that we could enhance the quality and completeness of RePP Africa 35 by not only searching existing databases but also including plants from additional sources (compilation, revision 1). The latter applies in particular to wind and solar power plants. In general, the cumulated capacities of existing power plants included in RePP Africa 35 differ by a minimum of +3.07% (hydropower) and a maximum of −31.82% (solar power) from data reported by IHA, IEA, IRENA, and Hydropower & DAMS 44 for 2020 and 2021 (Table 5; HPPD: +3.07% (IHA 2021), +3.98% (IRENA 2021), +12.81% (H&D 2020); SPPD: −27.31% (IEA 2021), −31.82% (IRENA 2021); WPPD: +12.86% (IEA 2021), +23.19% (IRENA 2021)). The largest discrepancy is between the reported cumulated capacity of proposed HPPs, with 132.05 GW according to the database presented here (HPPD) and 49-115 GW in the available literature (Hydropower & DAMS).

Table 5. Comparison of data from all searched databases (compilation databases) to the here presented African Renewable Power Plant Database (RePP Africa) and to other established data sources for renewable energy assessment (validation databases, year of census in brackets). Number of plants and cumulated capacities for the whole African continent are given in gigawatts (GW) for the specific renewable resources (hydropower, solar power, wind power) and the specific construction status (existing (E), under construction (U), proposed (P)).
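The signed percentage deviations quoted above can be computed as a relative difference. The Python sketch below assumes that the reference dataset (e.g. IHA, IEA, or IRENA) serves as the denominator, which is consistent with negative values indicating that RePP Africa reports less capacity than the reference; the function name and the example values are hypothetical and are not taken from Table 5.

```python
def relative_deviation_pct(repp_gw, reference_gw):
    """Signed deviation of a RePP Africa capacity from a reference value, in percent.

    Assumption: the reference value is the denominator, so a negative
    result means RePP Africa reports less capacity than the reference.
    """
    if reference_gw <= 0:
        raise ValueError("reference capacity must be positive")
    return (repp_gw - reference_gw) / reference_gw * 100.0


# Hypothetical values in GW, chosen only to illustrate the sign convention.
print(round(relative_deviation_pct(9.0, 10.0), 2))
print(round(relative_deviation_pct(11.0, 10.0), 2))
```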
The following reasons may explain the discrepancies in the hydropower data: (1) The HPPD contains data records on HPPs not covered by IHA or Hydropower & DAMS; (2) the HPPD includes up-to-date data (2021) that are not yet included in the IHA or Hydropower & DAMS datasets from 2020; (3) the implementation of proposed HPPs is subject to uncertainty. Different organisations use different definitions when including proposed or planned power plants in cumulative capacity calculations. The underestimation of the cumulated solar power capacity in RePP Africa 35 in comparison to IEA and IRENA might result from neglecting all solar power plants <1 MW. Another discrepancy appears in the breakdown by solar power plant type. IEA further differentiates between solar power from photovoltaics (PV) and concentrated solar power (CSP). According to IEA, 9.4 GW was installed as PV and 1.4 GW as CSP in 2021, versus 7 GW as PV and 3 GW as CSP in the SPPD of RePP Africa 35 .
At present, wind and solar energy outpace hydropower and other renewable sectors with their growth rates. Although the IEA provides forecasts for cumulative capacities for 2021 and 2022, these analyses are based on data from 2020, making the presented database RePP Africa 35 the most up-to-date open-access database on renewable power plants in Africa.

Code Availability
All processing steps, including data compilation and georeferencing, were carried out with ArcGIS Pro 2.9.5 software from ESRI 37 . The open-source software QGIS version 3.18 was used by assistants as additional software to georeference power plant locations 38 . We used the ArcGIS 'Edit - Create' function and the QGIS 'Add feature' function to manually assign power plant locations in cases where maps but no coordinates were accessible. Additional information is described in detail in the Methods section. No stand-alone programming code was created.